Fast hashing of variable-length text strings
Authors
Abstract
Similar resources
The universality of iterated hashing over variable-length strings
Iterated hash functions process strings recursively, one character at a time. At each iteration, they compute a new hash value from the preceding hash value and the next character. We prove that iterated hashing can be pairwise independent, but never 3-wise independent. We show that it can be almost universal over strings much longer than the number of hash values; we bound the maximal string le...
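The iteration described in this abstract can be sketched as follows. This is only an illustrative polynomial-style iterated hash, not the construction analysed in the paper: the function name `iterated_hash`, the multiplier 31, and the modulus 2**32 are assumptions chosen for the example.

```python
def iterated_hash(text: str, multiplier: int = 31, modulus: int = 2**32) -> int:
    """Hash a string one character at a time.

    Each step derives the new hash value from the preceding hash value
    and the next character. The multiplier and modulus are illustrative
    assumptions, not values taken from the paper.
    """
    h = 0  # initial hash value for the empty string
    for ch in text:
        # New value depends only on the previous value and the next character.
        h = (h * multiplier + ord(ch)) % modulus
    return h


if __name__ == "__main__":
    for word in ["", "a", "ab", "hash"]:
        print(repr(word), iterated_hash(word))
```

Because the state carried between steps is a single hash value, the function handles strings of any length in one pass with constant memory.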
Variable-Length Hashing
Hashing has emerged as a popular technique for large-scale similarity search. Most learning-based hashing methods generate compact yet correlated hash codes. However, this redundancy is storage-inefficient. Hence we propose a lossless variable-length hashing (VLH) method that is both storage- and search-efficient. Storage efficiency is achieved by converting the fixed-length hash code into a vari...
Matching a Set of Strings with Variable Length Don't Cares
Given an alphabet $A$, a pattern $p$ is a sequence $(v_1, \ldots, v_m)$ of words from $A^*$ called keywords. We represent $p$ as a single word $v_1 @ \cdots @ v_m$, where $@ \notin A$ is a distinguished symbol called the variable length don't care symbol. Pattern $p$ is said to match a text $t \in A^*$ if $t = u_0 v_1 u_1 \cdots u_{m-1} v_m u_m$ for some $u_0, \ldots, u_m \in A^*$. In this paper we address the following problem: given a set P of pattern...
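The matching condition defined in this abstract can be checked for a single pattern with a greedy left-to-right scan: find each keyword at its earliest position after the previous one. The sketch below is a naive illustration of the definition only, not the paper's algorithm for a set of patterns; the function name `matches` and the choice of `"@"` as the don't care symbol in the input string are assumptions.

```python
def matches(pattern: str, text: str, dont_care: str = "@") -> bool:
    """Return True if text = u0 v1 u1 ... vm um for the keywords v1..vm.

    The pattern is written as a single word v1@...@vm. Taking the
    leftmost occurrence of each keyword leaves the most room for the
    remaining keywords, so the greedy scan decides the match correctly.
    """
    pos = 0  # index in text just after the previously matched keyword
    for keyword in pattern.split(dont_care):
        found = text.find(keyword, pos)
        if found == -1:
            return False  # keyword does not occur after the previous one
        pos = found + len(keyword)
    return True


if __name__ == "__main__":
    print(matches("ab@cd", "xxabyycdzz"))
    print(matches("ab@cd", "cdab"))
```

The keywords must occur in order and without overlap, which is why `"ab@ab"` does not match the text `"ab"` but does match `"abab"`.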
Using Simple Recurrent Networks to Learn Fixed-Length Representations of Variable-Length Strings
Four connectionist models are reported that learn static representations of variable-length strings using a novel autosequencer architecture. These representations were learned as plans for a simple recurrent network to regenerate a given input sequence. Results showed that the autosequencer can be used to address the dispersion problem because the positions and identities of letters in a strin...
Text Mining Using Markov Chains of Variable Length
When dealing with knowledge federation over text documents, one has to figure out whether or not documents are related by context. A new approach is proposed to solve this problem. This leads to the design of a new search engine for literature research and related problems. The idea is that one already has some documents of interest. These documents are taken as input. Then all documents known to ...
Journal
Journal title: Communications of the ACM
Year: 1990
ISSN: 0001-0782,1557-7317
DOI: 10.1145/78973.78978